Fig 1: Prognostic signature in saliva distinguishes OSCC patients. a Workflow for machine-learning approach to measure the predictive power of peptides and proteins. b, c The predictive relevance of individual proteins and peptides to distinguish N0 from N+ patients is represented by a bar chart indicating their cross-validation ROC AUC (100 repetitions of stratified tenfold cross-validation). The most relevant protein and peptide ordered by the AUC are LTA4H and Pep8_LTA4H, respectively. When only the AUCs of the individual signatures (size 1) are considered, the three highest areas at the protein level are LTA4H (73.9%), COL6A1 (62.1%), and ITGAV (60.5%), and those at the peptide level are Pep12_CSTB (73.5%), Pep8_LTA4H (72.8%), and Pep9_COL6A1 (71.0%). d Cross-validation estimated ROC curves of the best protein and peptide signatures. e Box plots representing the AUC of all possible signatures for both imbalanced and balanced (SMOTE) cross-validation. At the peptide level, 1024 signatures were tested. At the protein level, 63 signatures were tested. Signatures formed by peptides from different proteins, S1 {Pep8, Pep12} and S2 {Pep8, Pep9, Pep12}, have an approximately 10.5% higher AUC than the peptide signature formed by LTA4H (S4). The S2 peptide signature outperformed both the S1 and S4 signatures. The candidate signatures are indicated by labels: S1, S2, S3, and S4. Peptide sequences: Pep1_MB: HGATVLTALGGILK; Pep2_MB: YLEFISECIIQVLQSK; Pep3_PGK1: VLNNMEIGTSLFDEEGAK; Pep4_PGK1: VLPGVDALSNI; Pep5_ITGAV: LQEVGQVSVSLQR; Pep6_ITGAV: STGLNAVPSQILEGQWAAR; Pep7_LTA4H: LTYTAEVSVPK; Pep8_LTA4H: DLSSHQLNEFLAQTLQR; Pep9_COL6A1: GLEQLLVGGSHLK; Pep10_COL6A1: TAEYDVAYGESHLFR; Pep11_NDRG1: EMQDVDLAEVKPLVEK; Pep12_CSTB: HDELTYF; Pep13_CSTB: SQVVAGTNYFIK; and Pep14_CSTB: VHVGDEDFVHLR. Four peptides were not included in the training model because they did not pass the filtering step (step 2 from Part 2 of Fig. 7a; P value < 0.1, Mann-Whitney U test).
Box plots represent the median and interquartile range, whiskers represent the 1–99 percentile, and outliers are represented by “+”
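The exhaustive search over signatures described in panel e can be pictured as enumerating every non-empty subset of the retained features. The sketch below is only an illustration of that counting step, not the authors' pipeline: the function name `candidate_signatures` is hypothetical, and the feature list assumes the six proteins named as differentially expressed in the Fig 4 caption, which is consistent with the 63 protein-level signatures reported here (2^6 − 1) but is not confirmed by the caption itself.

```python
from itertools import combinations


def candidate_signatures(features):
    """Yield every non-empty subset of `features` as a tuple."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


# Assumed feature list (see lead-in); the actual model inputs may differ.
proteins = ["CSTB", "COL6A1", "ITGAV", "LTA4H", "PGK1", "NDRG1"]
signatures = list(candidate_signatures(proteins))

print(len(signatures))  # 2**6 - 1 = 63 non-empty subsets
```

Each tuple would then be scored by its own repeated cross-validation run, which is why the number of models grows exponentially with the number of retained features.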
Fig 2: Kaplan-Meier survival analysis of IHC and clinical outcomes. Overall survival (OS), disease-free survival (DFS), and specific survival (SS) were available in relation to a second primary tumor, local, locoregional, or lymph node relapse. a Patients with lower CSTB expression in the ITF had a higher risk of local relapse and worse survival (P value < 0.05, log-rank test). In addition, lower NDRG1 expression in the ITF was associated with a higher risk of a second primary tumor and a worse DFS (P value < 0.05, log-rank test). Equal expression between the ITF and the inner tumor, or higher CSTB and NDRG1 expression in the ITF, did not influence local relapse or second primary tumor. Patients with higher PGK1 expression in the ITF had worse survival and early locoregional recurrence (P value < 0.05, log-rank test). b Patients with higher expression of ITGAV in the ITF had a greater risk of lymph node metastasis relapse and poorer survival (P value < 0.05, log-rank test)
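For readers unfamiliar with the curves in this figure, the Kaplan-Meier (product-limit) estimate multiplies, at each observed event time, the fraction of at-risk patients who survive that time. The following is a minimal standard-library sketch of that estimator only; it does not implement the log-rank test, the function name `kaplan_meier` is hypothetical, and the follow-up data are made up for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up durations
    events -- 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every subject leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Made-up follow-up times (months) and event flags, not study data.
times = [5, 8, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients (event flag 0) contribute to the risk set up to their last follow-up but never trigger a step down in the curve.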
Fig 3: Immunohistochemical staining of targeted proteins. Oral SCC tissue samples from a set of 125 cases were used to verify the abundance of a CSTB, LTA4H, NDRG1, and PGK1 neoplastic island proteins, and 96 cases were used to verify the abundance of b COL6A1, ITGAV, and MB tumor stroma proteins. Among the cases, it was possible to analyze 114 cases for CSTB, 118 cases for LTA4H, 119 cases for NDRG1, 118 cases for PGK1, 93 cases for COL6A1, 80 cases for ITGAV, and 86 cases for MB. The scores that represent the sum of the intensities and percentage of protein staining in the ITF or the inner tumor are shown as a heat map. Differential staining in the ITF and inner tumor in both neoplastic islands and tumor stroma was identified, in agreement with the MS discovery analysis; however, we also identified negative cases for each protein and cases with gradual staining according to the color key shown at the top right. For proteins selected in neoplastic islands, increased CSTB and NDRG1 expression was identified in the inner tumor, in agreement with the MS results, with staining only inside neoplastic cells of OSCC. However, LTA4H and PGK1 presented peripheral staining in neoplastic cells and were also detected in cells from tumor stroma, such as inflammatory cells. For proteins selected in tumor stroma, greater COL6A1 and MB expression was identified in the ITF area, which demonstrates the preferential localization in OSCC regions compared with that observed in the MS discovery analysis. COL6A1, ITGAV, and MB proteins were preferentially present in tumor stroma (histological images obtained using a ×40 objective; scale bars, 200 µm)
Fig 4: Targeted proteomics of saliva proteins. a, b Volcano plots show the log2 N+/N0 ratio of a proteins and b peptides according to the adjusted P value. Proteins that met the indicated statistical cut-off criteria (Mann-Whitney U test, with P values adjusted for multiple comparisons using the Benjamini-Hochberg FDR method, adjusted P value < 0.05) are colored in red. c The graph shows the individual L/H intensity ratios (not log transformed) of six differentially expressed proteins, CSTB, COL6A1, ITGAV, LTA4H, PGK1, and NDRG1, between N+ and N0 saliva samples. *P value < 0.05, Mann-Whitney U test. d Peptide relative quantification (log2 L/H ratio) between N+ and N0 saliva samples. For each protein, 2–3 proteotypic peptides were monitored, except for NDRG1, for which only one proteotypic peptide was monitored. The light peptide (corresponding to the endogenous peptide present in saliva) and the heavy peptide (corresponding to the synthetic peptide spiked into saliva) were monitored, and the light/heavy ratio for each of the 14 peptides was obtained by Skyline. Box plots represent the median and interquartile range, whiskers represent the 1–99 percentile, and outliers are represented by empty circles. e Bar plots represent the relationship between MS Discovery analysis of tissue and SRM-MS of saliva in the identification of potential prognostic signatures. The log2 N+/N0 ratio for saliva samples and log2 ITF/inner ratio for neoplastic islands and tumor stroma from microdissected tissues are represented in the graph. f A representative figure illustrates the gradient dynamics of the protein abundance between tissue (ITF and inner tumor) and saliva (N+ and N0), which indicates that the abundance of proteins in saliva and its association with prognosis (N+ and N0) are not necessarily associated with the proximity of the altered oral epithelium. Other components, such as water, electrolytes, DNA, RNA, and microorganisms, were not included.
Images in f were adapted from files provided by Servier Medical Art (https://smart.servier.com/, licensed under a Creative Commons Attribution 3.0 Unported License)
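The Benjamini-Hochberg adjustment named in the Fig 4 caption for panels a and b can be sketched in a few lines. This is a generic standard-library implementation of the step-up procedure, not the authors' code; the function name `benjamini_hochberg` and the raw P values are hypothetical, and the 0.05 threshold is the cut-off stated in the caption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw P values for four features, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]  # threshold from the caption
print([round(p, 3) for p in adj])
```

In a volcano plot, each feature would then be placed at its log2 ratio on one axis and its adjusted P value (typically on a −log10 scale) on the other.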